Abstract

This paper analyzes the performance of a novel heuristic to
obtain the minimal-length tour of N given points in the plane:
they are sequenced as they appear along a spacefilling curve.
The algorithm consists essentially of sorting, so it is easily
coded and requires only O(N) memory and O(N log N) operations.
Its performance is shown to be competitive with that of other
available methods.
1. Introduction. The travelling salesman problem (TSP) is
to construct a circuit of minimum total length that visits each
of N given points. Even in the plane, this problem is
NP-complete [10]. Thus, instances of practical interest cannot be
solved exactly in reasonable time. Accordingly, attention
has focussed on fast algorithms that generate good but not
necessarily optimal tours.

The  authors  have  recently  introduced  a  practical  approach  of 
appealing  simplicity  to  this  problem  [1-3].  It  is  exceptionally 
well  suited  to  manual  execution  (routes  may  be  generated  by 
nontechnical  personnel  without  a  computer  and  even,  after  an 
initial  setup,  without  a  map  [2]),  and  consequently,  it  has  been 
adopted  by  a  variety  of  commercial,  charitable  and  public 
organizations to generate daily delivery routes. It is based on
a spacefilling curve $\psi$, a continuous mapping from the unit
interval $C = [0,1]$ onto the unit square $S = [0,1]^2$, and
is performed as follows:

SPACEFILLING  HEURISTIC. 

1) For each point $p \in S$ to be visited, compute a $\theta \in C$ such
that $p = \psi(\theta)$.

2) Sort the points by their corresponding $\theta$'s.

In  other  words,  this  heuristic  visits  points  in  sequence  of  their 
appearance  along  the  spacefilling  curve. 
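
The two steps above amount to a keyed sort. The following is a minimal sketch, assuming a caller-supplied function `curve_position` that returns a $\theta$ for each point (a hypothetical name; the paper's own routine is THETA in Appendix A). The lookup table of corner positions is an illustrative stand-in, not the curve itself.

```python
from math import dist


def spacefilling_tour(points, curve_position):
    """Step 1 computes a key for each point; step 2 sorts by that key."""
    return sorted(points, key=curve_position)


def tour_length(tour):
    """Length of the closed circuit that visits the points in the given order."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))


# Stand-in positions for the four corners of the unit square (illustration only).
corner_position = {(0, 0): 0.0, (0, 1): 0.25, (1, 1): 0.5, (1, 0): 0.75}
tour = spacefilling_tour([(1, 1), (0, 0), (1, 0), (0, 1)], corner_position.get)
```

Any function that maps a point to its position along the curve can be passed as the key; the sort dominates the cost.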

Our  work  was  inspired  by  Karp  [8],  who  introduced  a  family  of 
O(N log N) algorithms to construct tours of length arbitrarily
close to optimal. (The effort grows rapidly as optimality is
approached, however!) Karp's algorithms divide S into rectangles
sufficiently small so that each contains a given number, say
t,  of  points.  A  routing  problem  is  solved  exactly  within  each 
rectangle,  and  the  subtours  are  patched  together.  Our  heuristic 
may  be  viewed  as  a  limiting  case  of  Karp's  algorithms:  the 
square  is  subdivided  into  subsquares  that  each  contain  but  one 
point,  and  the  patching  procedure  for  joining  subrectangles  is 
predetermined  and  specified  by  the  spacefilling  curve. 


The  spacefilling  heuristic  is  of  special  interest  because  it  is 
based  on  spacefilling  curves.  Originally  devised  as  topological 
counterexamples  nearly  a  century  ago,  these  were  long  regarded 
as "mathematical monstrosities". It is only recently that their
usefulness  has  been  recognized  [11].  Our  work  represents  the 
first  application  of  spacefilling  curves  to  combinatorial 
optimization. 

The  spacefilling  heuristic  is  appealing  due  to  its  ease  of 
execution.  But  it  is  necessary  to  show  that  it  also  performs 
well,  and  moreover,  is  competitive  with  other  methods.  Standard 
combinatorial  arguments  are  inappropriate  because  they  rely  on 
the  combinatorial  structure  of  the  problem  and  spacefilling 
curves,  by  their  very  nature,  eliminate  this  structure.  Our 
analysis utilizes properties of measure-preserving
transformations, metric spaces, and convexity.

This  paper  establishes  worst-case  bounds  on  the  heuristic  tour 
length  (Theorem  3)  and  the  ratio  of  heuristic  to  optimal  tour 
lengths  (Theorem  4),  and  almost  sure  bounds  (for  increasingly 
large  random  point  sets)  on  the  heuristic  tour  length  (Theorem 
5.3)  and  the  length  of  the  longest  link  along  the  heuristic  tour 
(Theorem  5.1).  To  streamline  the  presentation,  we  provide  the 
analysis  for  a  specific  curve  in  the  plane;  following  [3],  our 
methods  can  be  generalized  to  the  TSP  in  d-space,  and  to  more 
general  combinatorial  problems,  such  as  matching  and  clustering. 

Table  1  summarizes  our  performance  analysis  of  the  spacefilling 
heuristic.  Also  included  are  the  performances  of  comparable 
methods  cited  by  Bentley  [5]  as  particularly  simple.  These  are: 

Nearest Neighbor (NN). Start at an arbitrary point and
successively visit the nearest unvisited point. After all points
have been visited, return to the start.

Minimum Spanning Tree (MST). Construct the minimum spanning
tree of the point set and duplicate all the links of the
tree. Sequence the points as they would appear in a traversal of
the doubled tree. Pass through the sequence and remove all
representations after the first of each point.

Strip. Partition the square into vertical strips. Visit the
points in each strip in order (alternately top-to-bottom and
bottom-to-top) and visit the strips from left to right. Return
to the starting point.
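
Two of the comparison heuristics are short enough to sketch directly. The code below is an illustration under stated assumptions, not Bentley's implementation: the function names are hypothetical, the number of strips is left as a parameter `num_strips` because the text does not fix it, and the first strip is arbitrarily taken top-to-bottom. MST is omitted for brevity.

```python
from math import dist


def nearest_neighbor_tour(points, start=0):
    """Repeatedly visit the nearest unvisited point (quadratic time as written)."""
    unvisited = list(points)
    tour = [unvisited.pop(start)]
    while unvisited:
        nearest = min(unvisited, key=lambda p: dist(tour[-1], p))
        unvisited.remove(nearest)
        tour.append(nearest)
    return tour


def strip_tour(points, num_strips):
    """Sweep vertical strips of the unit square left to right, alternating direction."""
    strips = [[] for _ in range(num_strips)]
    for x, y in points:
        # Points with x == 1 are placed in the last strip.
        strips[min(int(x * num_strips), num_strips - 1)].append((x, y))
    tour = []
    for index, strip in enumerate(strips):
        top_to_bottom = index % 2 == 0  # assumed: the first strip runs top-to-bottom
        tour.extend(sorted(strip, key=lambda p: p[1], reverse=top_to_bottom))
    return tour
```

Both functions return the visiting order only; the return link to the starting point is implied.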

We  know  of  no  rigorous  statistical  study  of  the  expected  tour 
lengths  for  the  comparison  heuristics,  but  our  informal  tests 
indicate  practically  identical  behavior  among  these  algorithms 
for  large  problems  consisting  of  uniformly  distributed  points  in 
the  square. 


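[Table 1. Comparison of the spacefilling heuristic with NN, MST and
Strip. The column alignment did not survive extraction; entries are
listed per row in their original order.]

Memory: (no entries survive)
Worst-case effort (to solve / to modify): O(N^2)* / re-solve;
    O(N^2)* / re-solve*; O(N log N) / O(log N); O(N log N) / O(log N)
Worst-case ratio (bound, known): O(log N); log N / log log N; O(log N)
Longest tour: 2.15; 3.04; 2 42
Performance on nonuniform data: Good; Good; Good
Ease of coding: Good; Good; Good
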

Across  a  spectrum  of  criteria,  the  spacefilling  heuristic  is 
comparable  to  or  better  than  other  commonly  considered  heuristics 
for  the  TSP.  Unlike  these  heuristics,  however,  the  spacefilling 
heuristic  may  be  modified  in  the  spirit  of  Karp  to  produce  tours 
arbitrarily  close  to  optimal,  as  follows: 

ARBITRARILY CLOSE SPACEFILLING HEURISTIC.

1) Perform the spacefilling heuristic, and let $p_1, \dots, p_N$
represent the points sequenced according to the heuristic tour.

2) For $i = 1, t+1, 2t+1, \dots$, determine the shortest path
starting at $p_i$, passing through $p_{i+1}, \dots, p_{i+t-2}$ in
any sequence, and ending at $p_{i+t-1}$; adjust the heuristic
tour accordingly.

This algorithm requires $O(N \log N + N 2^t)$ effort. The analytical
techniques in this paper and in [4] and [8] can be applied
to show that it produces tours approaching optimal length as $N \gg t$
and $t \to \infty$.
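
Step 2 can be illustrated by re-optimizing consecutive windows of the sorted tour. This is a sketch under assumptions: the windows are taken as disjoint blocks of t points with fixed first and last points, and each block is solved by brute-force enumeration, which costs (t-2)! per block and so does not achieve the stated effort (that would need a dynamic program). The names `improve_windows` and `path_length` are hypothetical.

```python
from itertools import permutations
from math import dist


def path_length(path):
    """Length of the open path through the points in the given order."""
    return sum(dist(a, b) for a, b in zip(path, path[1:]))


def improve_windows(tour, t):
    """Reorder the interior of each block of t points to shorten the path through it."""
    tour = list(tour)
    for i in range(0, len(tour), t):
        window = tour[i:i + t]
        if len(window) < 4:
            continue  # fewer than two interior points: nothing to reorder
        first, *interior, last = window
        candidates = ([first, *middle, last] for middle in permutations(interior))
        tour[i:i + t] = min(candidates, key=path_length)
    return tour
```

Because the endpoints of every block stay in place, the links between blocks are unchanged and the closed tour can only get shorter or stay the same.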

Halton  and  Terada  [7]  have  devised  a  partitioning  heuristic 
for which they make impressive performance claims. The arbitrarily
close spacefilling heuristic can match this performance by
solving a sequence of N-point problems, N = 1, 2, ..., with t a
function of N such as t = log log log log N. The sort may be
performed in asymptotically linear time (e.g., by BINSORT). The
tours are asymptotically optimal, yet the effort is almost
linear, that is, O(N log log log N) a.s. Of course, the benefits
of  asymptotic  optimality  will  not  become  evident  until  N  >> 


Fundamental  properties  of  the  particular  spacefilling  curve 
used  in  this  study  are  given  in  Section  2.  Various  performance 
bounds  are  derived  in  the  next  three  sections. 

2.  Fundamental  properties  of  the  spacefilling  curve. 

The particular spacefilling curve upon which we base our analysis
is the limit of the sequence of curves shown in Figure 1. The $\theta$
required in step 1 of the spacefilling heuristic is evaluated
according to the function THETA, given in Appendix A.


Figure 1. Successive approximations to the spacefilling curve $\psi$.


We first consider the computational effort required to perform
the spacefilling heuristic when THETA is used to implement step
1. If the arguments X and Y of THETA are integer multiples of
$2^{-k}$ (that is, if they are given to k binary digits), then
THETA will call itself k times. Therefore evaluating THETA (step
1 of the heuristic) requires an effort that depends on k but not
N; it consists of bit sampling and shifting, and may be arranged
to require O(k) bit operations. Furthermore, the value returned
by THETA will be an integer multiple of $2^{-2k-2}$ (that is, it
will be given to 2k + 2 digits). So each comparison of $\theta$ values
in the sort (step 2 of the heuristic) requires an effort O(k)
that does not depend on N. Consequently we have:

PROPOSITION 2.1. The spacefilling heuristic requires
O(N log N) effort.

REMARK.  If  a  sorting  procedure  such  as  BINSORT  [9,13]  is  used, 
then  the  heuristic  may  be  performed  in  linear  expected  time. 

PROPOSITION 2.2. The spacefilling heuristic requires O(N)
memory.

Since the heuristic tour is simply a sorted list, every
subsequence of the heuristic tour is itself a heuristic tour. Thus, if
the set of points to be visited changes slightly, the current
solution need be modified only locally to produce a new heuristic
tour. This observation has important practical consequences in
applications where routes must be updated frequently [2]. It is
formally stated as follows:

PROPOSITION 2.3. Inserting a point into or deleting a point
from an N-point heuristic tour requires O(log N) effort.
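
A sorted list makes the update concrete. The sketch below is illustrative only: the class name `HeuristicTour` is hypothetical, positions along the curve are assumed to be supplied by the caller, and Python's `bisect` finds the slot in O(log N) comparisons while the underlying list insertion still shifts elements in linear time. A balanced search tree would be needed to meet the O(log N) bound in full.

```python
import bisect


class HeuristicTour:
    """Points kept in order of their position along the curve."""

    def __init__(self):
        self._positions = []  # sorted curve positions
        self._points = []     # points, parallel to _positions

    def insert(self, point, position):
        """Place the point at its sorted slot; the rest of the tour is untouched."""
        i = bisect.bisect_left(self._positions, position)
        self._positions.insert(i, position)
        self._points.insert(i, point)

    def delete(self, position):
        """Remove and return the point stored at the given curve position."""
        i = bisect.bisect_left(self._positions, position)
        if i == len(self._positions) or self._positions[i] != position:
            raise KeyError(position)
        del self._positions[i]
        return self._points.pop(i)

    def tour(self):
        """Current visiting order."""
        return list(self._points)
```

After either operation the remaining points keep their relative order, which is the locality property stated in the proposition.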

Next we establish three fundamental properties of the curve
which will be required in subsequent sections. The first two are
evident from Figure 1.

LEMMA 2.4. For any integers $k > 0$ and $0 \le i < 2^k$,
the set

$\{\psi(\theta) \mid i\,2^{-k} \le \theta \le (i+1)\,2^{-k}\}$

is an isosceles right triangle whose right angle lies at
$\psi((i+0.5)\,2^{-k})$.

LEMMA 2.5. The mapping $\psi$ is measure preserving. That
is, for any interval I in C,

$\mathrm{area}\{\psi(I)\} = \mathrm{length}\{I\}$.

Lemma 2.5 is important throughout Section 5, where we consider
random points uniformly distributed on S. These points are
unlikely to have finite binary representations, and an
infinite-precision version of THETA must be imagined to generate the $\theta$'s
to which they correspond. Since $\psi$ is measure preserving, these
$\theta$'s will be uniformly distributed on C and almost surely
uniquely determined (since the set of points to which many $\theta$'s
correspond has measure zero).

The final property of $\psi$ to be considered here expresses the
notion that a spacefilling curve preserves nearness; points close
together in C map (via $\psi$) onto points close together in S. We
take the measure of "nearness" on the square S to be Euclidean
distance, denoted by $D[\cdot,\cdot]$. As is evident in Figure 1, we
can view C as a circuit since $\psi(0) = \psi(1)$; thus the natural
metric on C is

$\Delta[\theta,\theta'] = \min\{|\theta-\theta'|,\ 1-|\theta-\theta'|\}$.

The  following  lemma  is  implicit  in  [11,  p.  65]  although  no  proof 
is  given. 


LEMMA 2.6. For any $\theta, \theta' \in C$,

$D^2[\psi(\theta), \psi(\theta')] \le 4\,\Delta[\theta, \theta']$.

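PROOF. Assume without loss of generality that $0 \le \theta'-\theta \le 0.5$.
Let $\theta_{i,k} = \max(\theta, \min(\theta', i\,2^{-k}))$ and

$Q_k = \sum_{i=0}^{2^k-1} D^2[\psi(\theta_{i,k}), \psi(\theta_{i+1,k})]$.

First suppose C1: $\theta' > i\,2^{-k}$ and $\theta < (i+1)\,2^{-k}$. Then by Lemma 2.4,
$\psi(\theta_{i,k})$ and $\psi(\theta_{i+1,k})$ both lie within an isosceles right
triangle whose right angle lies at $\psi((i+0.5)\,2^{-k})$.
By Lemma 2.5, the area of this triangle is $2^{-k}$. Since any
distance within a right triangle cannot exceed the length of the
hypotenuse,

$D^2[\psi(\theta_{i,k}), \psi(\theta_{i+1,k})] \le 4 \cdot 2^{-k}$. (2.1)

But if C1 does not hold, then $\theta_{i,k} = \theta_{i+1,k}$, which
also implies (2.1). So (2.1) holds for all i,k.

Assume further C2: $\theta < \theta_{2i+1,k+1} < \theta'$. The Pythagorean
Theorem (cosine law) yields

$D^2[\psi(\theta_{i,k}), \psi(\theta_{i+1,k})]$
$\le D^2[\psi(\theta_{i,k}), \psi(\theta_{2i+1,k+1})] + D^2[\psi(\theta_{2i+1,k+1}), \psi(\theta_{i+1,k})]$
$= D^2[\psi(\theta_{2i,k+1}), \psi(\theta_{2i+1,k+1})] + D^2[\psi(\theta_{2i+1,k+1}), \psi(\theta_{2i+2,k+1})]$. (2.2)

But if C2 does not hold, then $\theta = \theta_{2i,k+1} = \theta_{2i+1,k+1}$ or
$\theta' = \theta_{2i+1,k+1} = \theta_{2i+2,k+1}$,
which also implies (2.2). So (2.2) holds for all i,k.

Clearly, $Q_0 = D^2[\psi(\theta), \psi(\theta')]$. By (2.2) and the
definition of $Q_k$,

$Q_0 \le Q_1 \le \dots \le Q_k$. (2.3)

And by (2.1) and the definition of $Q_k$,

$Q_k \le (\lceil 2^k\theta' \rceil - \lfloor 2^k\theta \rfloor) \cdot 4 \cdot 2^{-k} \le 4(\theta'-\theta) + 8 \cdot 2^{-k}$. (2.4)

The limit as $k \to \infty$ of (2.3)-(2.4) yields the asserted inequality. ■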


3.  Worst-case  heuristic  tour  length.  We  now  show  that  the 
spacefilling  heuristic  cannot  produce  a  very  long  tour. 


THEOREM 3. The heuristic tour length cannot exceed $2\sqrt{N}$.

PROOF: Let $\theta_1, \dots, \theta_N$ be the sorted list generated by the
spacefilling heuristic, and set $\Delta_i = \theta_{i+1} - \theta_i$,
$i = 1, \dots, N-1$, and $\Delta_N = 1 + \theta_1 - \theta_N$.
By Lemma 2.6,

Heuristic tour length $\le \sum_{i=1}^{N} 2\sqrt{\Delta_i}$. (3.1)

But $\Delta_i \ge 0$ and $\sum_{i=1}^{N} \Delta_i = 1$. So the bound (3.1), a
symmetric concave function of $\{\Delta_i\}$, achieves its maximum
at $\Delta_i = 1/N$. ■
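
The maximization step can also be checked with the Cauchy-Schwarz inequality. The following worked line is one way to verify it and is not taken from the source; it uses only the constraints that the gaps are nonnegative and sum to 1.

```latex
\sum_{i=1}^{N} 2\sqrt{\Delta_i}
  \;\le\; 2\sqrt{N}\,\Bigl(\sum_{i=1}^{N}\Delta_i\Bigr)^{1/2}
  \;=\; 2\sqrt{N},
\qquad \text{with equality when } \Delta_i = \tfrac{1}{N} \text{ for all } i .
```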


4.  Worst-case  ratio  of  heuristic  to  optimal  tour  lengths. 
Although  Theorem  3  guarantees  that  the  heuristic  tour  cannot  be 
very  long,  the  optimal  tour  could  be  considerably  shorter.  The 
worst-case  instance  we  have  found  produces  a  heuristic  tour  that 
is  4.707  times  longer  than  optimal,  and  we  conjecture  that  the 
heuristic  tour  length  will  never  exceed  the  optimal  tour  length 
by  more  than  a  constant  factor.  But  the  strongest  result  we  have 
proved  is  the  following. 


THEOREM 4.

(Heuristic tour length) / (Optimal tour length) = O(log N).


PROOF. Let $\Pi$ be the set of N points (in S) to be visited. If
$\lambda_1, \dots, \lambda_N$ denote the N link lengths along the heuristic
tour, and $H(t) = \#\{k \mid \lambda_k > t\}$ (where $\#\{\cdot\}$ denotes cardinality),
then the heuristic tour length L may be written

$L = \sum_{k=1}^{N} \lambda_k = \sum_{k=1}^{N} \int_0^\infty 1\{\lambda_k > t\}\,dt = \int_0^\infty H(t)\,dt$, (4.1)

where $1\{\cdot\}$ denotes the indicator function. To establish the
claim, we will derive upper bounds on H(t).

The principal bound is derived from "Minkowski's sausage" [11],
denoted by $T(\epsilon)$, and defined to be the set of points (in the
plane) that lie within $\epsilon$ of at least one point in the locus of
the optimal tour (the union of points along the N segments that
form the tour). By a simple geometric argument, there are
$c, c' > 0$ such that

$\mathrm{area}\{T(\epsilon)\} \le c\,\epsilon L^* + c'\epsilon^2$, for all $\Pi$ and all $\epsilon > 0$, (4.2)
where $L^*$ is the length of the optimal tour through $\Pi$. (The
first term of (4.2) represents asymptotic behavior as $\epsilon \to 0$ and
$T(\epsilon)$ becomes a ribbon of length $L^*$ and width $2\epsilon$; the second
term represents asymptotic behavior as $\epsilon \to \infty$ and $T(\epsilon)$ becomes a
circle of radius $\epsilon$.)

Now for arbitrary $t > 0$, let $m = \lceil 2/t \rceil$. Partition C into
disjoint intervals $I_1, \dots, I_{m^2}$, each of length $m^{-2}$.
By Lemma 2.6, the points in $\psi(I_k)$ lie within $2/m$ of each
other, so

$\psi(I_k) \cap \Pi$ is nonempty $\Rightarrow$ $\psi(I_k) \subseteq \Pi(2/m)$, (4.3)

where $\Pi(\epsilon)$ denotes the set of points (in the plane) that lie
within $\epsilon$ of at least one point in $\Pi$. If the distance between
any two consecutive points along the heuristic tour exceeds $2/m$,
then the $\theta$'s corresponding to these points cannot lie in the
same interval, so

$H(t) \le \#\{k \mid \lambda_k > 2/m\} \le \#\{k \mid \psi(I_k) \cap \Pi \text{ is nonempty}\}$. (4.4)

By Lemma 2.5, $\mathrm{area}\{\psi(I_k)\} = m^{-2}$. Thus (4.3)-(4.4) yield

$H(t) \le m^2\,\mathrm{area}\{\Pi(2/m)\}$. (4.5)

Clearly $\Pi(\epsilon) \subseteq T(\epsilon)$. So (4.2), (4.5), and the definition
of m become

$H(t) \le 2cmL^* + 4c' \le 4cL^*/t + 2cL^* + 4c'$. (4.6)

Another bound on H(t) can be established by noting that the
distance between any two points in $\Pi$ cannot exceed $L^*$ or
$\sqrt{2}$, so

$H(t) = 0$, $t > \min\{L^*, \sqrt{2}\}$. (4.7)

Furthermore, since there are only N links in the heuristic tour,

$H(t) \le N$. (4.8)

Now combining (4.1) and (4.6)-(4.8) gives

$L \le \int_0^{\min\{L^*, \sqrt{2}\}} \min\{N,\ 4cL^*/t + 2cL^* + 4c'\}\,dt$

$\le \int_0^{L^*} \min\{N,\ 4cL^*/t\}\,dt + \int_0^{\sqrt{2}} 2cL^*\,dt + \int_0^{L^*} 4c'\,dt$.

Evaluating the integrals yields $L/L^* = O(\log N)$. ■
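
The final evaluation can be written out as follows. This is a worked sketch rather than the authors' text: it reads the upper limits of the three integrals as $L^*$, $\sqrt{2}$ and $L^*$, and it assumes $N \ge 4c$ so that the breakpoint $t_0 = 4cL^*/N$ lies in $[0, L^*]$.

```latex
\int_0^{L^*} \min\{N,\, 4cL^*/t\}\,dt
  = \int_0^{t_0} N\,dt + \int_{t_0}^{L^*} \frac{4cL^*}{t}\,dt
  = 4cL^*\Bigl(1 + \ln\frac{N}{4c}\Bigr),
\qquad
\int_0^{\sqrt{2}} 2cL^*\,dt = 2\sqrt{2}\,cL^*,
\qquad
\int_0^{L^*} 4c'\,dt = 4c'L^*,
\\[4pt]
\frac{L}{L^*} \;\le\; 4c\Bigl(1 + \ln\frac{N}{4c}\Bigr) + 2\sqrt{2}\,c + 4c' \;=\; O(\log N).
```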


5. Stochastic analysis. Let $\{p_i\}$ be an infinite sequence
of independent uniformly distributed points in S, and let

$L_N$ = length of the heuristic tour through $\{p_1, \dots, p_N\}$,

$L^*_N$ = length of the optimal tour through $\{p_1, \dots, p_N\}$.

In a classic work, Beardwood, Halton and Hammersley [4] showed
that $L^*_N/\sqrt{N} \to \beta^*$ a.s. (The constant $\beta^*$ has
been experimentally determined to be 0.765.) The purpose of
this section is to produce similar asymptotic bounds on the
heuristic tour lengths $L_N$. We also examine the length of the
longest link along the heuristic tour, and show that it grows
only slightly faster than the average link length.

To begin, we note that the nicest possible convergence,
$L_N/\sqrt{N} \to \beta$, does not hold. We prove instead a result
whose practical implications are the same, that $L_N/\sqrt{N}$
"converges" to a narrow interval $[\beta^-, \beta^+]$ in the sense that

$\beta^- \le \liminf L_N/\sqrt{N}$ and $\limsup L_N/\sqrt{N} \le \beta^+$, a.s. (5.1)

Since the optimal tour is no longer than the heuristic tour,
$\beta^* \le \beta^-$, and by Theorem 3, $\beta^+ \le 2$, so $\beta^-$ and
$\beta^+$ exist. This section will establish tight bounds on these
constants. Numerical evaluation of the bounds shows that $\beta^-$
and $\beta^+$ lie within a range no greater than 0.956 ± 0.001.
Thus, for large N, the heuristic tour length will be about 25%
above optimal.

Our analysis may be modified, as in [4], so it applies to
independent points nonuniformly distributed in the plane. If the
points have density f(x,y), and $K(f) = \iint \sqrt{f(x,y)}\,dy\,dx$,
then the optimal tour grows as $K(f)\,\beta^*\sqrt{N}$ and the
heuristic tour grows as between $K(f)\,\beta^-\sqrt{N}$ and
$K(f)\,\beta^+\sqrt{N}$. Thus the heuristic tour remains about 25%
longer than the optimal tour. If the points are uniformly
distributed over any region of area $A > 0$, then $K(f) = \sqrt{A}$.


This has a useful consequence; it implies that the N/K-th point
in a tour will lie at 1/K-th the tour length from the start of
the tour, when N is large. (To see this, observe that N/K
consecutive $\theta$'s span 1/K-th the length of C, and constitute
independent, uniformly distributed points over the image under $\psi$ of
that range. Since $\psi$ is measure preserving, that image has area
1/K.) An important application is the formation of delivery
routes for K vehicles by partitioning the travelling salesman
tour into K segments, each to be travelled by a single vehicle
[2]. All subtours contain equal numbers of points, but it is
desirable that they be equally long. Partitioning the spacefilling
heuristic tour produces routes which tend to be of equal
length. This is not true of the optimal tour, or of tours formed
by other heuristic methods.
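
The partitioning step is a matter of cutting the sorted list into consecutive blocks. A minimal sketch follows, with a hypothetical function name `split_into_routes`; it balances the number of points per vehicle, which by the argument above tends to balance route lengths only for large N and uniformly distributed points.

```python
def split_into_routes(tour, num_vehicles):
    """Cut the tour into consecutive segments whose sizes differ by at most one."""
    size, extra = divmod(len(tour), num_vehicles)
    routes, start = [], 0
    for vehicle in range(num_vehicles):
        # The first `extra` vehicles each take one additional point.
        end = start + size + (1 if vehicle < extra else 0)
        routes.append(tour[start:end])
        start = end
    return routes
```

For example, ten points split among three vehicles yield segments of four, three and three points, each a contiguous stretch of the heuristic tour.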

Even  the  largest  distance  between  two  consecutive  points  along 
the  heuristic  tour  cannot  greatly  exceed  the  average  interpoint 
distance,  as  the  following  theorem  demonstrates.  Thus,  the 
spacefilling  heuristic  produces  a  tour  whose  performance  is  good 
with  respect  to  the  "bottleneck"  (maximum  link  length)  criterion 
[6]  as  well  as  the  customary  total  tour  length  criterion. 

THEOREM 5.1. Let $E_N$ denote the length of the
longest link in the heuristic tour of $\{p_1, \dots, p_N\}$.
Then, for all $a > 1$,

$\lim_{N\to\infty} \mathrm{Prob}\{E_N > 2a\sqrt{(\ln N)/N}\} = 0$.

PROOF: Let $e_N$ be the largest distance between consecutive $\theta$'s
in the sorted list produced by the spacefilling heuristic. In
light of Lemma 2.6, we need only show that

$\lim_{N\to\infty} P\{e_N > a^2 (\ln N)/N\} = 0$.

The distance between $\theta$'s is not affected by linear shifts
within C. Subtracting the smallest $\theta$ from each $\theta$ in the list
produces a $\theta$ at 0 (equivalently at 1) and a sorted list of N-1
independent uniformly distributed $\theta$'s between 0 and 1. These
points determine N intervals of which the largest has length
$e_N$.

We now construct a new random variable whose distribution
coincides with that of $e_N$. Let $\{t_i\}$ be an infinite sequence
of independent exponentially distributed random variables of
mean 1, and let $S_N = \sum_{i=1}^{N} t_i$. That is, $S_N$ is the time of
the N-th arrival of a unit intensity Poisson process. Also let
$M_N = \max_{i \le N} t_i$. If $S_N = T$, then $S_1, \dots, S_{N-1}$ will be conditionally
distributed as a sorted list of N-1 independent uniform random
variables over $[0,T]$. Thus, $M_N/T$ is conditionally distributed
as $e_N$, given $S_N = T$. Integrating over T, this becomes:
$M_N/S_N$ is distributed as $e_N$. It remains to show that

$\lim_{N\to\infty} P\{M_N/S_N > a^2 (\ln N)/N\} = 0$.

Clearly,

$P\{M_N/S_N > a^2 (\ln N)/N\}$

$= P\{M_N/\ln N > a^2 S_N/N\}$

$\le P\{M_N/\ln N > a \text{ or } S_N/N < 1/a\}$

$\le P\{M_N/\ln N > a\} + P\{S_N/N < 1/a\}$.

Since $M_N$ is the maximum of N independent exponential random
variables,

$P\{M_N/\ln N > a\} = 1 - (1 - N^{-a})^N \le N^{1-a}$,

where we have used the inequality $(1-x)^N \ge 1 - Nx$, $0 \le x \le 1$.
This expression vanishes as $N \to \infty$. And, by the Law of Large
Numbers, $P\{S_N/N < 1/a\}$ vanishes as well. ■
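
The scale of the largest gap between consecutive positions can be explored numerically. The sketch below is an illustration, not part of the proof: it draws N uniform positions on the circuit C, measures the largest circular gap, and rescales it by N / ln N; the function names and the seed are arbitrary choices.

```python
import math
import random


def largest_circular_gap(n, rng):
    """Largest gap between consecutive values of n uniform points on a unit circuit."""
    positions = sorted(rng.random() for _ in range(n))
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    gaps.append(1.0 + positions[0] - positions[-1])  # wrap-around interval
    return max(gaps)


def normalized_gaps(n, trials, seed=0):
    """Largest gaps rescaled by n / ln n, the scale that appears in the proof."""
    rng = random.Random(seed)
    return [largest_circular_gap(n, rng) * n / math.log(n) for _ in range(trials)]
```

By the bound above, a rescaled value exceeding $a^2$ for fixed $a > 1$ becomes increasingly rare as N grows.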


REMARK. We have observed that $E_N/\sqrt{(\ln N)/N}$ does not converge
as N grows, but that it most often lies between 1.1 and 1.3.


We now return to the bounds $\beta^-$ and $\beta^+$ in (5.1). We first
examine the relationship between the random heuristic tour
lengths $L_N$ and their expectations $E\{L_N\}$.

LEMMA 5.2. The random sequence $[L_N - E\{L_N\}]/\sqrt{N}$
converges almost surely to zero. Thus (5.1) holds with

$\beta^- = \liminf E\{L_N/\sqrt{N}\}$, $\beta^+ = \limsup E\{L_N/\sqrt{N}\}$.

PROOF. Steele's proof [12] that $[L^*_N - E\{L^*_N\}]/\sqrt{N} \to 0$ a.s.
applies to the heuristic tour length as well. ■

Since $\beta^-$ and $\beta^+$ are determined by the deterministic
sequence of expectations $E\{L_N\}/\sqrt{N}$, the remaining
analysis will be concerned solely with expectations. We now show
that $\beta^-$ and $\beta^+$ are quite close, so that "for all practical
purposes" $L_N/\sqrt{N}$ converges almost surely.

Define

$m(t) = \int_0^1 D[\psi(\theta), \psi(\theta + t \bmod 1)]\,d\theta$ (5.2)

to be the expected Euclidean distance between two uniformly
distributed points in S whose corresponding $\theta$'s lie exactly
distance t apart. Also let $g(t) = \sqrt{t}\,e^{-t}$ and
$\gamma(t) = \sum_{k=-\infty}^{\infty} 4^k t\,g(4^k t)$.

Clearly $\gamma(4t) = \gamma(t)$. Furthermore, $\gamma$ displays very little
variation;

$\gamma^- = \min \gamma(t) \approx 1.275$,
$\gamma^+ = \max \gamma(t) \approx 1.281$. (5.3)

These  numbers  have  the  following  significance. 

THEOREM 5.3. There is a constant $r^*$ such that

$\gamma^- r^* \le \beta^- \le \beta^+ \le \gamma^+ r^*$

and $r^*$ is arbitrarily closely determined by

$\left| r^* - \int_a^{4a} t^{-3/2}\,m(t)\,dt \right| \le 6a$, $\forall\ 0 < a < 0.25$.

To establish Theorem 5.3 we require two lemmas whose proofs are
given in Appendix B. The first shows that m(t) displays a
certain limiting behavior as $t \to 0$.

LEMMA 5.4. There is a continuous function r on (0,1]
such that

(a) $|m(t) - r(t)\sqrt{t}| \le 2t^{3/2}$.

(b) $r(t/4) = r(t)$.

(c) $0 \le r(t) \le 2$.

Let $\Delta_N$ be a random variable whose distribution is the same
as that of the difference between two consecutive $\theta$'s in the
sorted list of N independent values. Clearly,

$E\{L_N\}/\sqrt{N} = \sqrt{N}\,E\{\text{link length}\} = \sqrt{N}\,E\{m(\Delta_N)\}$. (5.4)

For large N, the sorted list of $\theta$'s will approximate a Poisson
process on C, and so the distribution of $\Delta_N$ will approach a
negative exponential of mean 1/N. Moreover, since $\Delta_N$ is likely
to be small, the limiting form of m(t) (given by Lemma 5.4a)
may be substituted into (5.4). These manipulations are
summarized by

LEMMA 5.5.

$\lim_{N\to\infty} \left\{ E\{L_N\}/\sqrt{N} - \int_0^\infty r(t)\sqrt{Nt}\,N e^{-Nt}\,dt \right\} = 0$.

PROOF OF THEOREM 5.3. Let $r^* = \int_a^{4a} r(t)/t\,dt$; by
Lemma 5.4b, this integral does not depend on a. By Lemma 5.5,
$E\{L_N\}/\sqrt{N}$ approaches

$\int_0^\infty r(t)\sqrt{Nt}\,N e^{-Nt}\,dt$

$= \int_0^\infty r(t)\,g(Nt)\,N\,dt$

$= \sum_{k=-\infty}^{\infty} \int_{4^k a}^{4^{k+1} a} r(t)\,g(Nt)\,N\,dt$

$= \int_a^{4a} r(t)\,\gamma(Nt)/t\,dt$ (by Lemma 5.4b)

and the asserted result follows from Lemmas 5.2 and 5.4a. ■


REMARK. These bounds may be made even tighter by using the
stronger result $r(t/2) = r(t)$ instead of Lemma 5.4b. This places
$\beta^-$ and $\beta^+$ within $10^{-5}$ of each other. Our estimates,
given at the start of this section, are based on these tighter
bounds, as well as Monte-Carlo simulation of (5.2) to compute
$r^*$. The statistical error of our simulation was ± $10^{-3}$.

APPENDIX A


AN ALGORITHM TO PERFORM STEP 1 OF THE SPACEFILLING HEURISTIC


Let:

ABS(A) = A if A >= 0, = -A if A < 0.

INT(A) = the largest integer not larger than A.

FRACT(A) = A-INT(A).

MIN(A,B) = A if A < B, = B if A >= B.

MOD(A,B) = B*FRACT(A/B).

NV(X,Y) = the 'number' of vertex (X,Y) of the unit square,
counting clockwise from the origin, i.e., NV(0,0)=0,
NV(0,1)=1, NV(1,1)=2, NV(1,0)=3.

The algorithm is given as a recursive function:

FUNCTION THETA(X,Y):

If X=1 and Y=1 then RETURN(0.5)

Q=NV(MIN(INT(2*X),1), MIN(INT(2*Y),1))   (Q identifies the quadrant
                                          containing (X,Y))

T=THETA(2*ABS(X-0.5), 2*ABS(Y-0.5))      (T is the position along the
                                          subcurve in quadrant Q)

If MOD(Q,2)=1 then T=1-T

RETURN(FRACT((Q+T)/4 + 7/8))             (Visit the vertices of
                                          a quadrant clockwise)
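
The listing can be transcribed into Python as follows. This is a sketch, not the authors' code: it replaces the recursion with an explicit list of quadrant numbers, and it rounds the inputs to `bits` binary digits (an added assumption, default 20) so that the loop is guaranteed to terminate for arbitrary floating-point coordinates in the unit square.

```python
# Vertex numbers of the unit square, clockwise from the origin (NV in the listing).
NV = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def theta(x, y, bits=20):
    """Position in [0, 1) along the curve of a point (x, y) in the unit square."""
    scale = 1 << bits
    # Quantize to `bits` binary digits; the loop below then ends after bits + 1 steps.
    x = round(x * scale) / scale
    y = round(y * scale) / scale
    quadrants = []
    while not (x == 1.0 and y == 1.0):
        quadrants.append(NV[(min(int(2 * x), 1), min(int(2 * y), 1))])
        # Coordinates within the quadrant, measured from the centre of the square.
        x, y = 2 * abs(x - 0.5), 2 * abs(y - 0.5)
    t = 0.5  # base case THETA(1,1)
    for q in reversed(quadrants):
        if q % 2 == 1:
            t = 1.0 - t  # odd quadrants traverse their subcurve in reverse
        t = ((q + t) / 4 + 7 / 8) % 1.0
    return t
```

With this function the heuristic itself is `sorted(points, key=lambda p: theta(*p))`. The four corners of the square map to 0, 0.25, 0.5 and 0.75, in the clockwise order given by NV.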


APPENDIX B


PROOFS OF LEMMAS REQUIRED IN THEOREM 5.3.

PROOF OF LEMMA 5.4. Let

$m^-(t) = \int_0^{1-t} D[\psi(\theta), \psi(\theta+t)]\,d\theta$

and

$m^+(t) = m^-(t) + 2t^{3/2} = m^-(t) + \int_{1-t}^{1} 2\sqrt{t}\,d\theta$.

Clearly $m^-(t) \le m(t)$, and by Lemma 2.6, $m(t) \le m^+(t)$. Since
$\psi$ visits S by visiting four subsquares of S, each identical to S
at half the scale (see Figure 1),

$m^-(t) \le 2\,m^-(t/4)$,

$m^+(t) \ge 2\,m^+(t/4)$.

So $2^k m^-(t/4^k)$ and $2^k m^+(t/4^k)$ approach a common limit $m^*(t)$ satisfying

$m^*(t) = 2\,m^*(t/4)$ (B.1)

and

$|m(t) - m^*(t)| \le 2t^{3/2}$. (B.2)

By Lemma 2.6,

$m(t) \le 2\sqrt{t}$, (B.3)

and by the triangle inequality

$m(t+\epsilon) \le m(t) + m(\epsilon) \le m(t) + 2\sqrt{\epsilon}$,

so $m(\cdot)$ is continuous. Now let $r(t) = m^*(t)/\sqrt{t}$. Continuity
of r follows from that of m; (a) follows from (B.2); (b) follows
from (B.1) and (c) follows from (B.3). ■


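PROOF OF LEMMA 5.5. The exact density of $\Delta_N$ is

$f_N(t) = N(1-t)^{N-1}$, $0 \le t \le 1$,

so (5.4) may be written

$E\{L_N\}/\sqrt{N} = \sqrt{N} \int_0^1 m(t)\,f_N(t)\,dt$.

Now

$\left| E\{L_N\}/\sqrt{N} - \sqrt{N} \int_0^1 r(t)\sqrt{t}\,f_N(t)\,dt \right|$

$\le \sqrt{N} \int_0^1 |m(t) - r(t)\sqrt{t}|\,f_N(t)\,dt$

$\le \int_0^1 2(Nt)^{3/2} (1-t)^{N-1}\,dt$ (by Lemma 5.4a)

$\le \int_0^1 2(Nt)^{3/2} e^{-(N-1)t}\,dt$

$\le \int_0^\infty 2(Nt)^{3/2} e^{-(N-1)t}\,dt$

$= O(1/N)$.

It remains to show that the following sequence vanishes as $N \to \infty$:

$\left| \sqrt{N} \int_0^1 r(t)\sqrt{t}\,f_N(t)\,dt - \sqrt{N} \int_0^\infty r(t)\sqrt{t}\,N e^{-Nt}\,dt \right|$

$\le 2N^{3/2} \int_0^\infty \sqrt{t}\,\left| \max(0, 1-t)^{N-1} - e^{-Nt} \right| dt$ (by Lemma 5.4c)

$\le \int_0^\infty 2\sqrt{\tau}\,\left| \max(0, 1-\tau/N)^{N-1} - e^{-\tau} \right| d\tau$. ($\tau = Nt$)

Since $(1-\tau/N)^{N-1}$ converges upward to $e^{-\tau}$, the integrand
of this upper bound converges pointwise to zero as $N \to \infty$, and by
Lebesgue's dominated convergence theorem, the limit integral
vanishes as well. ■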


